-
Notifications
You must be signed in to change notification settings - Fork 158
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Integration Test Framework] fix createTempDir and flaky tests #5409
Conversation
Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please make lint happy, otherwise looks good.
Folks, I fixes some flaky tests (see c187045) that were also affecting this PR. Could you review those changes as well? |
createTempDir register a test cleanup function to remove the folder it created, however, on Windows, this folder sometimes fails to be removed because there are still open file handlers for the files within the folder. We fix this problem by retrying to remove the folder with a maximum overall wait time of 3 seconds. This is a very similar approach to what Go's t.TempDir does.
Remove the custom removeAll function in favour of the install.RemovePath that already exists.
Fix the flakiness from TestUpgradeHandler* tests by re-working the mockUpgradeManager, now it accepts a function for its Upgrade method and their implementation is goroutine safe
72924bc
to
9d98ac8
Compare
|
createTempDir register a test cleanup function to remove the folder it created, however, on Windows, this folder sometimes fails to be removed because there are still open file handlers for the files within the folder. We fix this problem calling install.RemovePath that will retry removing the folder for about 2s. This is a very similar approach to what Go's t.TempDir does. Fix the flakiness from TestUpgradeHandler* tests by re-working the mockUpgradeManager, now it accepts a function for its Upgrade method and their implementation is goroutine safe (cherry picked from commit 1242e71) # Conflicts: # internal/pkg/agent/application/actions/handlers/handler_action_upgrade_test.go # pkg/testing/fixture.go # testing/integration/event_logging_test.go
createTempDir register a test cleanup function to remove the folder it created, however, on Windows, this folder sometimes fails to be removed because there are still open file handlers for the files within the folder. We fix this problem calling install.RemovePath that will retry removing the folder for about 2s. This is a very similar approach to what Go's t.TempDir does. Fix the flakiness from TestUpgradeHandler* tests by re-working the mockUpgradeManager, now it accepts a function for its Upgrade method and their implementation is goroutine safe (cherry picked from commit 1242e71)
TestLogIngestionFleetManaged was failing because the namespace generated by the integration tests framework was not unique among different tests and test runs, so sometimes collisions would occurs causing some tests to be flaky. TestDebLogIngestFleetManaged was failing because it also has got Beats logging connection errors before receiving the configuration from Elastic-Agent, now this message is also in the allow list. When testing .deb the AGENT_KEEP_INSTALLED environment variable is respected. When an integration test fails, the work directory created by the framework is now kept and its path is printed. createTempDir register a test cleanup function to remove the folder it created, however, on Windows, this folder sometimes fails to be removed because there are still open file handlers for the files within the folder. We fix this problem by retrying to remove the folder with a maximum overall wait time of 3 seconds. This is a very similar approach to what Go's t.TempDir does. Fix the flakiness from TestUpgradeHandler* tests by re-working the mockUpgradeManager, now it accepts a function for its Upgrade method and their implementation is goroutine safe TestEnvWithDefault Now TestEnvWithDefault unsets all environment variables it sets, allowing it to be run multiple times using -count. TestContainerCMDEventToStderr TestContainerCMDEventToStderr did not call agentFixture.Prepare early enough leading to an empty STATE_PATH env var, so all state information was in /usr/share/elastic-agent, which could make subsequent tests to fail because they could read /usr/share/elastic-agent/state/container-paths.yml and use a state path different than the one set in the test.
What does this PR do?
createTempDir register a test cleanup function to remove the folder it created, however, on Windows, this folder sometimes fails to be removed because there are still open file handlers for the files within the folder.
We fix this problem by retrying to remove the folder with a maximum overall wait time of 3 seconds. This is a very similar approach to what Go's t.TempDir does.
Fix the flakiness from TestUpgradeHandler* tests by re-working the
mockUpgradeManager, now it accepts a function for its Upgrade method
and their implementation is goroutine safe
Why is it important?
It fixes flakiness on Windows hosts
Checklist
[ ] I have made corresponding changes to the documentation[ ] I have made corresponding change to the default configuration files[ ] I have added tests that prove my fix is effective or that my feature works[ ] I have added an entry in./changelog/fragments
using the changelog tool[ ] I have added an integration test or an E2E testDisruptive User Impact
There is not user impact
How to test this PR locally
Run the integration test on Windows:
Related issues
Questions to ask yourself